    A Novel Partitioning Method for Accelerating the Block Cimmino Algorithm

    We propose a novel block-row partitioning method to improve the convergence rate of the block Cimmino algorithm for solving general sparse linear systems of equations. The convergence rate of the block Cimmino algorithm depends on the orthogonality among the block rows obtained by the partitioning method. The proposed method takes numerical orthogonality among block rows into account through a row inner-product graph model of the coefficient matrix. In the graph partitioning formulation defined on this graph model, the partitioning objective of minimizing the cutsize directly corresponds to minimizing the sum of inter-block inner products between block rows, which improves the eigenvalue spectrum of the iteration matrix. This in turn leads to a significant reduction in the number of iterations required for convergence. Extensive experiments conducted on a large set of matrices confirm the validity of the proposed method against a state-of-the-art method.
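
The correspondence between cutsize and inter-block inner products can be illustrated with a small sketch. It is not the authors' implementation: the function names `row_inner_product_graph` and `cutsize`, the use of absolute values as edge weights, the dense list-of-lists matrix, and the example data are all assumptions made for illustration. A real workflow would hand the graph to a graph partitioner rather than compare partitions by hand.

```python
from itertools import combinations

def row_inner_product_graph(A):
    """Build a weighted graph with one vertex per row of A.

    An edge (i, j) exists when rows i and j are not orthogonal; its
    weight is |<a_i, a_j>| (taking the absolute value is an assumption).
    """
    edges = {}
    for i, j in combinations(range(len(A)), 2):
        w = abs(sum(x * y for x, y in zip(A[i], A[j])))
        if w > 0:
            edges[(i, j)] = w
    return edges

def cutsize(edges, part):
    """Sum the weights of edges whose endpoints lie in different blocks."""
    return sum(w for (i, j), w in edges.items() if part[i] != part[j])

# Hypothetical 4x4 matrix: rows 0 and 2 are strongly coupled, as are rows 1 and 3.
A = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 3.0],
    [2.0, 0.0, 1.0, 0.0],
    [0.5, 1.0, 0.0, 1.0],
]
edges = row_inner_product_graph(A)

# part[i] is the block-row index assigned to row i.
naive_cut = cutsize(edges, [0, 0, 1, 1])   # consecutive rows grouped together
better_cut = cutsize(edges, [0, 1, 0, 1])  # strongly coupled rows grouped together
print(naive_cut, better_cut)
```

In this toy case the second partition keeps the heavily coupled rows in the same block, so the inner products that remain between blocks are much smaller, which is the property the abstract links to faster convergence.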

    A domain decomposing parallel sparse linear system solver

    The solution of large sparse linear systems is often the most time-consuming part of many science and engineering applications. Computational fluid dynamics, circuit simulation, power network analysis, and material science are just a few examples of the application areas in which large sparse linear systems need to be solved effectively. In this paper we introduce a new parallel hybrid sparse linear system solver for distributed memory architectures that contains both direct and iterative components. We show that by using our solver one can alleviate the drawbacks of direct and iterative solvers, achieving better scalability than with direct solvers and more robustness than with classical preconditioned iterative solvers. Comparisons to well-known direct and iterative solvers on a parallel architecture are provided. Comment: To appear in Journal of Computational and Applied Mathematics.

    Parallel scalable PDE-constrained optimization: antenna identification in hyperthermia cancer treatment planning

    We present a PDE-constrained optimization algorithm which is designed for parallel scalability on distributed-memory architectures with thousands of cores. The method is based on a line-search interior-point algorithm for large-scale continuous optimization. It is matrix-free in that it does not require the factorization of derivative matrices. Instead, it uses a new parallel and robust iterative linear solver on distributed-memory architectures. We will show almost linear parallel scalability results for the complete optimization problem, which is a new emerging important biomedical application and is related to antenna identification in hyperthermia cancer treatment planning.

    Parallel hybrid sparse system solvers

    We present a family of hybrid algorithms that are suitable for the solution of large sparse linear systems on parallel computing platforms. This study is motivated by the lack of robustness of Krylov subspace iterative schemes with black-box preconditioners, such as incomplete LU-factorizations, and by the lack of scalability of direct sparse system solvers. Our hybrid solver is as robust as direct solvers and as scalable as iterative solvers whose preconditioners are both effective and scalable. Our method relies on weighted symmetric and nonsymmetric matrix reordering to bring the largest elements onto or closer to the main diagonal, resulting in a very effective extracted banded preconditioner. Systems involving the extracted banded preconditioner are solved via a member of the recently developed SPIKE family of algorithms. The effectiveness of our method is demonstrated by solving large sparse linear systems that arise in various applications such as computational fluid dynamics, oil reservoir simulations, and nonlinear optimizations. Finally, we present a highly accurate method for predicting the parallel scalability of our system solver on architectures with more nodes than the platform on which our experiments have been performed.

    A General Sparse Sparse Linear System Solver and Its Application in OpenFOAM

    Solution of large sparse linear systems is frequently the most time-consuming operation in computational fluid dynamics simulations. Improving the scalability of this operation is likely to have a significant impact on the overall scalability of the application. In this white paper we show scalability results up to a thousand cores for a new algorithm devised to solve large sparse linear systems. We have also compared pure MPI and hybrid MPI-OpenMP implementations of the same algorithm.

    Weighted Matrix Ordering And Parallel Banded Preconditioners For Iterative Linear System Solvers

    The emergence of multicore architectures and highly scalable platforms motivates the development of novel algorithms and techniques that emphasize concurrency and are tolerant of deep memory hierarchies, as opposed to minimizing raw FLOP counts. While direct solvers are reliable, they are often slow and memory-intensive for large problems. Iterative solvers, on the other hand, are more efficient but, in the absence of robust preconditioners, lack reliability. While preconditioners based on incomplete factorizations (whenever they exist) are effective for many problems, their parallel scalability is generally limited. In this paper, we advocate the use of banded preconditioners instead and introduce a reordering strategy that enables their extraction. In contrast to traditional bandwidth reduction techniques, our reordering strategy takes into account the magnitude of the matrix entries, bringing the heaviest elements closer to the diagonal, thus enabling the use of banded preconditioners. When used with effective banded solvers (in our case, the Spike solver), we show that banded preconditioners (i) are more robust compared to the broad class of incomplete factorization-based preconditioners, (ii) deliver higher processor performance, resulting in faster time to solution, and (iii) scale to larger parallel configurations. We demonstrate these results experimentally on a large class of problems selected from diverse application domains.
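
The effect of a magnitude-aware reordering on band extraction can be sketched as follows. This only illustrates the idea in the abstract and is not the paper's reordering algorithm: the permutation is supplied by hand, and the helper names `permute`, `extract_band`, and `band_weight_fraction`, the Frobenius-weight measure, and the example matrix are all hypothetical.

```python
def permute(A, perm):
    """Apply the same permutation to rows and columns: B[r][c] = A[perm[r]][perm[c]]."""
    return [[A[i][j] for j in perm] for i in perm]

def extract_band(A, k):
    """Keep entries within half-bandwidth k of the diagonal, zero the rest."""
    n = len(A)
    return [[A[i][j] if abs(i - j) <= k else 0.0 for j in range(n)]
            for i in range(n)]

def band_weight_fraction(A, k):
    """Fraction of the squared Frobenius norm of A captured by the band."""
    total = sum(x * x for row in A for x in row)
    band = sum(x * x for row in extract_band(A, k) for x in row)
    return band / total

# Hypothetical matrix whose heaviest off-diagonal entries (9.0) sit far from the diagonal.
A = [
    [4.0, 0.0, 0.0, 9.0],
    [0.0, 4.0, 1.0, 0.0],
    [0.0, 1.0, 4.0, 0.0],
    [9.0, 0.0, 0.0, 4.0],
]

before = band_weight_fraction(A, 1)

# Hand-picked symmetric permutation that places unknowns 0 and 3 next to each other.
B = permute(A, [0, 3, 1, 2])
after = band_weight_fraction(B, 1)
print(before, after)
```

Here the tridiagonal band of the original ordering misses the two heaviest entries, while the reordered matrix is captured entirely by the same band, so a banded preconditioner extracted after reordering would approximate the matrix far better.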

    PSPIKE: A Parallel Hybrid Sparse Linear System Solver

    The availability of large-scale computing platforms comprised of tens of thousands of multicore processors motivates the need for the next generation of highly scalable sparse linear system solvers. These solvers must optimize parallel performance, processor (serial) performance, and memory requirements, while being robust across broad classes of applications and systems. In this paper, we present a new parallel solver that combines the desirable characteristics of direct methods (robustness) and effective iterative solvers (low computational cost), while alleviating their drawbacks (memory requirements, lack of robustness). Our proposed hybrid solver is based on the general sparse solver PARDISO and the “Spike” family of hybrid solvers. The resulting algorithm, called PSPIKE, is as robust as direct solvers, more reliable than classical preconditioned Krylov subspace methods, and much more scalable than direct sparse solvers. We support our performance and parallel scalability claims using detailed experimental studies and comparison with direct solvers, as well as classical preconditioned Krylov methods.